THE ROTATION OF EIGENSPACES OF PERTURBED MATRIX PAIRS 
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Abstract. We revisit the relative perturbation theory for invariant subspaces of positive 
definite matrix pairs. As a prototype model problem for our results we consider parameter 
dependent families of eigenvalue problems. We show that new estimates are a natural way 
to obtain sharp — as functions of the parameter indexing the family of matrix pairs — 
estimates for the rotation of spectral subspaces. 

> 

o 

0> ■ 1. Introduction and motivation 

This paper is concerned with the notion of the optimahty of bounds on the rotation of 
^ I spectral subspaces of positive definite Hermitian matrix pairs under the influence of ad- 
^ ■ ditive perturbations. Precisely, given positive definite Hermitian matrix pairs {H, M) and 
(if, M) = {H+6H, M+6M) and their spectral subspaces £ and £ of the same dimensionality 
we provide estimates 



sin0M(£, £)|| < Gap-^^—r J^ + Gapg- 



^ ^ where rjA = \\A~^^'^{A — is the usual relative distance between positive definite 

■ Hermitian matrices A and A, Gapj measure the gaps in the spectrum and || sin9M(£, £)|| 
measures the size of the rotation in the scalar product {x,y)M = x*My dependent on the 

^ ! matrix M. For more on sin 6 theorems see [5l [9l [121 HSl HI]. In comparison, we approach the 
problem of the changing scalar product by presenting our estimates in the M-scalar product, 
whereas the standard approach yields estimates in the Euclidean scalar product. 

Let us now consider the notion of the optimality of perturbation estimates in the con- 
text of parameter dependent perturbation families. In this setting we analyze rotations of 
eigenspaces of positive definite Hermitian matrix pairs {H, M) under the infiuence of a pa- 
^ ' rameter dependent family of perturbations. The allowed families of perturbations 611^ and 

■ — where k is some indexing parameter — are assumed to satisfy the restrictions 

(2) \x*5H^y\ < 3'{h)^Jx*Hx y*Hy \x*SM^y\ < 9{k)^/x*Mx y*My 

(3) lim 3^(/t) = lim g(K) = 0. 

Here the matrix valued functions SH,^ and SM^ are assumed to take value in the space of 
Hermitian matrices of appropriate size, and by a convention x* denotes the transpose or 
Hermitian transpose of an object x — be it matrix or vector — as is given by the context. 
We also assume that 5" and S are some real valued functions and we apply ([1]) by setting 
i^K := H + SHi^ and := M + 6M^ and noting the estimates rjH^ < and rjM^ < S(/t) 
if we set H = and M = M^. 
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It is our aim to argue that matrix dependent scalar product gives a natural environment 
to obtain optimal convergence estimates as functions of the parameter k. Further feature of 
our theory is that our estimates are invariant^ to the "inversion of the problem" , so we can 
also obtain estimates in the H based scalar product by switching the roles of H and M. In 
this context the reader should also note that the identity under the assumption that k 
is such that < 1 and S(/t) < 1 yields the estimates (cf. equation ([S3])) 



Let us now give more insight into the applications which are covered by the assumptions 
(I2]). This structure is rich enough to include discretization matrices approximating several 
singularly perturbed families of problems appearing in mathematical physics. Among other 
applications, the penalty methods for Stokes and Maxwell equations from [16] can be ana- 
lyzed in this context, too. The parameter k is then called the penalty parameter, and it is 
of interest what happens to the eigenvalues and eigenspaces as k — > oo. In [161 Section 4] 
the authors have studied the perturbation of eigenvalues by a very elegant Gerschgorin type 
argument and in this paper we give an eigenspace counterpart of such a result. For more 
details see jXj and the explicitly solved academic model problems from Section 14.11 

Also, the effect of numerical integration on the rotation of eigenspaces, when assembling 
finite element mass and stiffness matrices, is covered by (]2]). In this context k is the parameter 
describing the effect of increasing accuracy of the integration formula. This approach is also 
used for "mass lumping" which amounts to constructing a diagonal matrix D = M + 6M, 
with SM small in some sense. For further information and references see the paper [1] , |X] 
and the academic example from Section 14.21 where we rather favorably compare our results 
with those that follow from the standard reference [T5] . 

We end this discussion by noting that similar energy norm estimates for eigenvectors have 
been obtained in [TU] in the context of the analysis of Laczos method. Furthermore, the 
authors show how to efficiently compute the ingredients of the estimator in the context of 
computationally competitive numerical linear algebra procedures. We extend some of those 
results by giving a subspace version of some of the estimates, e.g. see appropriate parts 
of [T0| Proposition 3.3 and 3.4] and compare with our numerical results from Section |H It 
is possible that our subspace results could be of technical help when developing a similar 
analysis of the block Lanczos method. 

We now turn to the main question of this paper. What is the real nature of the sharpness 
claim of a sin theorem? Many of such theorems are obtained under essentially different 
spectral assumptions. Each is claimed to be sharp by constructing an appropriate example 
where the bound is attained. The results cannot be readily compared, even though one class 
of results can be seen to be following from the other, since their optimality depends on the 
set of assumption which were necessary to obtain the results. Quantitatively, transforming 
one class of results into the other type of estimates changes the quantitative performance of 
the results so considerably that a direct comparison is no longer fair. In a sense, each result 



The value of Gapj does not change under this transformation of the problem. 
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is "sharp" given the setting in which it has been obtained so discussion is more about which 
set of assumptions are more appropriate than the others. 

We do not further address this fundamental questions. Instead, we opt to normahze the 
estimates by dividing the measure of the rotation which is being estimated with the estimator 
and then compare various estimates on specially tailored model problems. A first logical 
candidate — in a single matrix case — for a competing estimate would be a sin G theorem 
from 121 1131 [P]. However, it turns out that this estimate — as the function of k — is overly 
pessimistic. Our aim is in particular to derive sharp estimates for the rotation of eigenspaces 
for this class of problems given by the parameter dependent family if^ = H + 6Hi^. The 
solution is to look for the rotation of eigenspaces in the energy norm, that is H based. In 
our setting this boils down to the analysis of the matrix pair {H~^, H^,). Let us also point 
out that we will discuss sharpness, or lack of it, in the various sinG results by comparing 
the residual type estimates which can be obtained for matrix pairs 

We will conclude that any estimate of an eigenspace rotation under the assumptions ([2]) is 
actually meant to be in a matrix dependent scalar product and that it will under-perform if 
used to measure rotations in the Euclidean scalar product. 

2. Notations, definitions and the general setting 

The optimal setting to consider all of the above eigenvector problems is an analysis of the 
whole class of positive definite matrix pairs {H,M), where H and M are positive definite. 
More to the point, we consider the following generalized eigenvector problem 

(6) Hx = \Mx, 
and the corresponding perturbed one 

(7) {H + 6H)x = X{M + 6M)x, 

where H,M,H = H + SH,M = M + 5Me C"''" are Hermitian positive definite. 

2.1. Spectral theorem and block operator matrix notation. Under these assumptions 
matrix pairs {H, M) can be simultaneously diagonalized, that is there exists a non-singular 
matrix X such that 

(8) X*HX = A, X*MX = /, 

where A = diag (Ai, . . . , A„) Aj G M for i = 1, . . . ,n and we use X* to denote the Hermitian 
adjoint. 

We will represent our perturbation problem by block operator matrices and will use the 
following notation for the perturbation problems which will be needed in the analysis. Let 
us decompose X and X as 

X = [X, X2] X = [X, X2] , 

where Xi, Xi G C"^^ and X2, X2 G C"^""'^. The eigen-decomposition ([8]) can now be written 

as 



(9) 



X* 
X* 



H[Xi X2] 



"Ai 


0" 




"X*" 





A2_ 




X2*_ 
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M[Xi X2] 



h 

In-k 



Similarly as above, for perturbed quantities one can write 



(10) 
and 



X* 

X* 



'XI 
X* 



H[X^ X2] 



H[X^ X2] 



Ai 


" 




'XI 


_0 


A2. 


; 




Ai 


0" 




'xf 


_0 


A2. 
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where Xi e C"^'= and X2 G C 



"^nxn—k 



M[Xi X2] 



M[X^ X2] 
and X = [Xi X2] ■ 



Ik 




4 










Ifi—k 



2.2. Measuring perturbations of positive definite matrices. The size of the pertur- 
bations 6H and 6M will be measured in the relative sense. This means that we assume that 
we have information on the singular values of the matrices 

H-^/\H - H)H~^'^ and M-^/^{M - M)M-^/^ 



(12) 
or 
(13) 



H-^/^{H - H)H-^/^ and M-^''^{M - M)M- 



1/2 



Typically we only use the maximal singular value, that is the spectral norm estimate of these 
relative perturbations. More to the point we will use the quantities defined in the lemma 
below in most of our arguments. In this paper we use || ■ II2 to denote the spectral matrix 
norm, and || ■ || to denote any unitary invariant matrix norm, when there is no danger of 
confusion. 



Lemma 2.1. Let H he a positive definite matrix and let ^ h 
r^jj = \\H-^/^{H - H)H-^/^\\2. Then for any x, y G 



\H-^I\H - H)H'y^\\ and 



(14) 
(15) 

(16) 



x*{H — H)x\ < TjH x*Hx, 

Vx*Hx x*Hx, 



\x*{H -H)y\ < 
\H-^'\H-H)H-^'^\\ < 









1 




- TlH 


< 


Vh 



\\H-^'\H - H)H-^'^\ 
in the case II • II = II ■ I 



2- 



In particular, relation 17^) reduces to ^ h < ^ i-riH 

The proof is by direct computation, see also [H]. Let us note that our theory is not limited 
to the use of spectral norm only. We allow for the consideration of any unitary invariant 
norm of the perturbations (fT2!) and (fT3|) . 



Remark 2.2. 

(17) 



When there is a danger of confusion we will use the notation 



/2| 



to denote the dependence of the perturbation measure on the unitary invariant norm. 

Remark 2.3. Let us note that in an application of this theory in the setting of the mass 
lumping finite element methods we consider the perturbations of the M matrix. The constant 
Vm for such a perturbation typically depends on mesh parameters. Furthermore, let us note 
that if there exist constants 61, 60, < 60 < 61 such that 

60 x*Dx < x*Mx < 61 x*Dx 
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holds for some symmetric positive definite matrices D and M, then M = D has the 
property 

-^^x*Mx <x*Mx< -^^x*Mx 
do + di do + di 

which can be written as 

\x*(M - M)x\ < -i ^ x*Mx . 

Ol + ^0 

2.3. Relations between subspaces in the changing scalar product. Let us now define 
the basic tools which will be used to compare subspaces of C". Let X and ^ be some generic 
m-dimensional subspaces of C". For any of such subspaces there are base^X, Y G C"^™ such 
that X = Ran(X) and V = Ran(F). Let us choose X and Y such that X*X = Y*Y = Im 
then Px = XX* and Py = YY* are orthogonal projections onto X and Typically, 
one compares the subspaces X and ^ by analyzing the spectral properties of the product 

S{x,]j) = {I — Px)P]}- The m-singular values of Sx^y — the restriction of Sx,)j on X — are 

called the sines of the angle between the subspaces X and In the matrix notation they 
are exactly the m-singular values of the matrix 

Sx,y = {I-XX*)Y. 

This is the measure of the size of the rotation^ in C" which would move the subspace X onto 

In this note we analyze the angles between the subspaces X and ^ in the scalar product 
{x,y)M = x*My, x, y G C" which is defined by the positive definite matrix M. To this end 
let X*MX = Y*MY = Im, which is to say let X and Y be M-unitary. Then the sines of 
the angle between X and V in the M-scalar product are the m-singular values of the matrix 
product 

= M^/2(/ - XX*M)Y 

For more on angles between the subspaces of C" see [5l ITT] . 

Let us not that since both M and H are subject to perturbation particular care is needed 
because the underlying space geometry changes with M. We shall therefore simplify the 
subsequent discussion of the M-product dependent subspace angles. 

In order to be definite we shall concentrate — and give explicit formulae for the angles — only 
on the relationship between the subspaces of interest for our analysis. That is we consider 
the relationship between the subspaces Ran(Xi), Ran(Xi) and Ran(Xi). 

The columns of Xi and Xi are M-orthogonal, then we use the following characterization 
of the sines of the canonical angles between the M orthogonal subspaces Xi = Ran(Xi) and 
Xi = Ran(Xi) induced by weighted M-inner product: 

(18) sin eM(Xi, Xi) = x;mxi . 

Let us now consider the problem of the changing scalar product. Since M = M + SM, it 
follows from (fTOj) that 

X*MX = I - X*6MX . 



^By saying the basis Y we mean "the basis given by the columns of Y" . 

"'Such rotation exists if aU of the sines of the angle between X and y are strictly smaller than one. 
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Assume that I —X*5MX is positive definite. Then X and XY * are M-orthogonal, where Y 
is Cholesky factor such that YY* = I — X*6MX . We can now use a similar characterization 
of the sines of canonical angles as in ( |T8l) . We show that the sines of the canonical angles 
between the eigenspaces Xi = Ran(Xi) and Xi = Ran(Xi) induced by weighted M-inner 
product are given by: 

(19) sineA/(Xi,Xi) = X*^XiFi7*, where Y = 

Finally, let Px = XX* be orthogonal projector onto m-dimensional subspace X = Ran(X), 
where X satisfies X*X = Im- Using the [T5| Theorem II 4.10.], one can write 

(20) ||sine(X,y)|| = \\Px-Pd = ||(/-Px)Py|| = Ul-Py)Px\\ , 

for any unitary invariant norm || ■ ||. Further, note that the columns of the matrices X^^ = 
M^^^Xi, Xf^ = M^/^Xi and Xf^ = M^/^XiY{l* are unitary, thus using ((201), one can write 

where Px, = Xf (Xf )* and where Pj^ = Xf (Xf )*, and similarly 

||sineM(Xi,Xi)|| = ||pj^ -pjjl = \\x;mx,y,-,*\\, 

where Pj-^ = Xf (Xf )* and Pj^ = Xf{Xf)*. 

The results above imply the upper bound for the sines of the canonical angles between 
the eigenspaces Xi = Ol{Xi) and Xi = Ran(Xi), can be obtained in any unitary invariant 
norm || ■ || , using the simple triangle inequality. We have 

(22) II sin 6Af(Xi, Xi) II < || sin 0m(Xi, Xi) || + || sin 0m(Xi, Xi) || , 

that is II sinBA/(Xi, Xi)|| can be estimated as the sum of the upper bounds for the norms of 
the sines matrices from f|T8|) and f|T9l) . 

3. The main result 

Our aim is to derive a bound for the sines of the canonical angles between eigenspaces 
Xi = Ran(Xi) and Xi = Ran(Xi) from ^ and (fTOj). 

This will be done with the two steps procedure as suggested by the form of the inequality 
(!22|) . This approach is in the line with the pioneering analysis of the relative sensitivity of 
the eigenvalues of a positive-definite matrix pair from [2]. 

The road- map for the prof is outlined in the following list. For each preparatory step we 
will prove a theorem to justify the procedure and at the end we will combine the conclusion 
in the main theorem. The preparatory steps can be classified as follows: 

(1) H perturbed, M unchanged 

(23) X*HX = A, X*MX = J, X*HX = A, X*MX = J, 

where A = diag (Ai, . . . , A„) , A = diag (Ai, . . . , A„) , Aj, A, G M, for i = 1, . . . , n. 
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1^22 



(21) 



sinBif (Xi, Xi) 



Pxi — P^ 



(2) M perturbed, H unchanged 

(24) X*HX = A, X*MX = J, X*HX = A, X*MX = /, 

where A = diag (Ai, . . . , A„) , and A = diag (Ai, . . . , A„) , and Aj, Aj G M, for i = 
l,...,n. 

The main tools in our analysis will be sharp estimates for the solution of the structured 
Sylvester equations from [131 Lemma 2.4] and fiSi Lemma 2.3]. That is, we consider the 
structured Sylvester equation^ 

(25) AX-XB = A^/^CB^/^ 

(26) AX-XB = CB. 

3.1. The first step. Now we will state our first theorem. We will use the notation and the 
conclusions of Lemma 12.11 without further comments. 

Theorem 3.1. Let {H,M) be a Hermitian pair defined by ([^ and let {H,M) be perturbed 
pair defined by 

{H + SH)x = XMx. 

Let X = [Xi X2] and X = [Xi X2] , be non-singular matrices which simultaneously 

diagonalize the pairs {H,M) and{H,M), asm By setting = \\H-^/^{H-H)H-'^/'^\\ 
we have 

(27) ||,i„e„(X,.X.)||<j^. 

where 

(28) RelGap = min -^^^^^^^ A2 = diag (A^+i, . . . , A„), Ai = diag (Ai, . . . , A^) . 
Proof. Since, according to ( ITSl) . 

sineM(Xi,Xi) = X*MXi, 
we have to bound HXgMXiH. By the definition we have X*HX = A, and so one can write 

(29) H^/^X = Uk^/^ , 

where U = \Ui t/2] = H^^'^XAr^/'^ is unitary and has the block structure conforming to the 
structure of X. A similar identity also holds for perturbed quantities. On the other hand, 
for perturbed quantities it also holds 

{H + 6H)Xi = MXiAi, 

where Ai = diag (Ai, . . . , A,.), and similarly for unperturbed quantities. We multiply the 
above equality by Xg from the left, and get 

X;HXi - X*MXiAi = -X*5HXi . 



^The solution of ((25|) is presented in (TSj Lemma 2.4]. This equation has also been analyzed in infinite 
dimensional setting in f9j. The equation (f26| has been analyzed in 13,; Lemma 2.3], see also [12] . 
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Using the fact that HX2 = MX2A2, this identity can be transformed into 

(30) A2x;mXi - x;mXiAi = -x;6hXi . 

We will proceed by rearranging the right-hand side of (1301) . For that purpose note that one 
can rewrite the right-hand side of (I5U]) as 

(31) x;6HXi = x;h^^'^h-^^^6hh-^/^h^/^Xi , 

which together with fl2^ gives 

The above equality and (!30|) give 

(32) AsX^MXi - X*MXiAi = -Ky^U;H-'^''^5HH-^/^Uil]'^ . 



This identity can be recognized as the structured Sylvester equation from fl25|l . This equation 
is even meaningful when H and M are unbounded operators. In this setting it is called the 
weak Sylvester equation and it has been analyzed in 

Applying [131 Lemma 2.4] to obtain the bounds on the solution of the structured Sylvester 
equation (see also [12]), on f l5^ one gets, see ffT7|) : 

vpl'-ll lA - A I 

(33) ||X*MXi|| < — where RelGap = min ' ' ^' 



RelGap 



for any unitary invariant norm 



A,eA2 ,A,gAi 



3.2. The second step — the change in scalar product. Here we will derive the upper 
bound for the sines of the canonical angles between the eigenspaces Xi = Ran(Xi) and 
Xi = Ran(Xi) induced by weighted M-inner product, defined by 

(34) sin Qm&u Xi) = X^MX^Y-^-^* , 

where 



(35) 





" 




'v* 
-'11 


V* ' 

^21 


Y21 


Y22 







V* 
^22_ 



I - X*6MX . 



Remark 3.2. Note that one of possibilities to chose Yn in l[35\) can be obtained by Block 
Cholesky elimination applied on the right-hand side in l[35\) . This choice yields the block 



Yu = ^^I-Xl5MXi. 

We now consider the problem of the perturbation of the matrix pair {H,M) to {H,M). 
The following theorem contains the upper bound for the ||X2MXi||, where || ■ || stands for 
any unitary invariant norm. 

Theorem 3.3. Let {H, M) be a Hermitian pair and let (if, M) be perturbed pair defined by 

{H + 6H)y = XMy. 
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Ai A2 

a a + 5 

Figure 1. Spectral configuration for Theorem 13.31 

Let X = [Xi X2] o,nd X = [Xi X2] , be non-singular matrices which simultaneously 

diagonalize the pairs {H,M) and {H,M), as in [24\ )- If 

(36) ||A2||<a and || A^^ ||"^ > a + 5 or 

(37) \\A2^\\'^ >a + S and ||Ai|| < a 

where A2 = diag (A^+i, . . . , A„), Ai = diag (Ai, . . . , A^), see FigureUl then 



(3 



x;mXi 



< 



RelGapp 

Here we have used \I/m = || M~^/^(M — M)M~^^'^\\ and for all 1 < p < 00 and we have 
(39) — ^— : > min — ^— '^^^ , , =: RelGap„ . 



Proof. For the perturbed quantities it holds that 

{H + 6H)Xi = MXiAi, 

where Ai = diag (Ai, . . . , A,), and similarly for the unperturbed quantities. Now, by multi- 
plying the above equality by X2 from the left, we get 

x;hXi - x;mXiAi = o . 

Using the fact HX2 = MX2K2 (see ^) this gives 

(40) K2X2MXi - X;MXiAi = -X*5MXiki . 



We will proceed by rearranging the right-hand side of fHOl) . For that purpose note that 
one can rewrite the right-hand side of fHOl) as 



(41) X;6MXi = X*M^I'^M-^''^5MM-^''^M^''^XiKi . 

Recall, that from fl2I|) it follows that Q*2 = X^M^/"^ and Qi = M^/'^Xi have unitary columns, 
which together with ( HTl) gives 

Applying Lemma 2.3] to obtain the bounds on the solution of a structured Sylvester 
equation (see also P^), on (1521) one gets: 

(42) \\X;SMXi\\ < II M-i/2^MM-i/2 II 

RelGapp 

9 



where || ■ || stands for any unitary invariant norm, and RelGap^ is defined as in (139|) . Now 
from fH21) directly follows bound 



3.3. The main result. As we have mentioned in our road-map, form fl22p follows that the 
upper bound for 

II sineAf(Xi, Xi)|| 

will be obtained as the sum of the bounds for || sin6M(Xi, Xi)|| and || sinjv/ 0(Xi, Xi)||. Thus 
we have the following theorem: 

Theorem 3.4. Let {H, M) be a Hermitian pair and let {H, M) be the perturbed pair. Let X = 
[Xi X2] and X = [Xi X2] ; be non-singular matrices which simultaneously diagonalize the 
pairs {H,M) and {H,M), as in ^ and l[W\) . respectively. If 

TIM := \\M-^l^5MM-^'^\\2 < i, 

and if ^EE) or hold, then 

(43) |l.ine„(X,X01| < ■ + • • *« . 

where \\ sin 0m(Xi, Xi)|| — the sine of the angle between the subspaces in M scalar product 
— is defined by l[21\) . and RelGap and RelGap^ are defined by (d^j and ^3^, respectively. 

Proof. Using f l22|) . (!27|) . and (!38|) and the multiplicative properties of unitary invariant 
matrix norms one gets 

(44) ||smeM(X.,XOII < • + • *M ■ Pu'lh. 



where Yu = y I — X^dMXi is defined as in ( 135|) . It left us to compute the bound for 

||^n^||2- Using the M-orthogonality of X it can be easily seen that X and M^^^'^(I + 
M'^^'^SMM^^^'^)^^^'^ are unitarily similar, that is that exists unitary matrix Q such that 

(45) X = M-^'^{I + M~i/2(5MM-i/2)~^/2g 

Now we can proceed, note that 

/ ~ ~ N -1/2 1 1 

(46) \\(l-Xl5MXA II2 < , < 



1 - ||Xi*5MXi||2 y 1 - ||X*5MX||2 
Set W = M-^/^5MM-^/'^, then from <^ follows 

(47) ||X*5MX||2 = + WY^'^W{I + W)-^/H2 <-^^. 



Finally inserting ( I47j) in (146|) one gets 

(48) \\{l-Xl5MX,)-"^\\< ^^^^ 



Now, insert (HSj) in to get f l43p . which completes the proof. 
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An alternative version — that is to say a version where alternative relative perturbation 
sizes feature — can be obtained using Lemma 12.11 



Corollary 3.5. Under the assumptions of Theorem\3.4\ we have the estimate 



(49) ||sineM(Xi,Xi)|| < ^ <^H . I $M 



RelGap \^T^^t]h RelGapp y/1 — 2 rjM 

where $m = \\M-^/'^5MM-^/^\\ and = \\H-^/^HH-'^/^\\. 

3.3.1. Weakening the assumption on the spectral dichotomy. Note that theorem 13.31 requires 
the special structure on specters of Ai and A2. The reason for this lies in the more involved 
analysis of the structures Sylvester equation (!26|) . see the comment in the introduction to 

This limitation can be overcome by the use of the Frobenius norm instead of spectral norm. 
Thus, the next theorem contains the perturbation bound similar to the one from Theorem 



13.31 given for 



the pair {H, M) 



X*MXi 



, without any additional assumptions on spectral configuration of 

F 



Theorem 3.6. Let {H,M), {H,M), X = [Xi X2] and X = [Xi X2], be as in Theorem 
El Then 



(50) 



x;mXi 



l-F 



F RelGap,„j„p 

where we remember the definition ^E'^j''^ = \\M^^/^6MM^^/^\\f from (11) and we assume 



that 

(51) RelGap^o^p := min 



A,eAi 



is strictly larger than zero. 



Proof. The first part of the proof is similar to the proof of theorem 13.31 up to the equality 
(HTl) . Thus we continue the proof from there, that is one can write: 

(52) A2X*MXi - X*MXiAi = -X*5MXiAi , 
and 

(53) X*5MXi = QlM-^''^5MM-^''^Qi , 

where Q*2 = XgM^/^ and Qi = M^/^Xi have unitary columns. 
By interpreting ( |52l) and ( |53l) component-wise if follows 



(A2UX;MX,),, - (X*MXi),,(Ai),, = -{QlM-'/^6MM-'/^Q,),,{A,),, 

or 

(54) (X*MXi)., = ((Q2)(:,)M-^/^<5MM-V2(Q,)(^^^.^) , 

(A2jii - 

where {Q){:,j) denotes j-th column of the matrix Q. 
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By computing the Frobenius norm from (15^ we have 

n n 

(55) \\X;MX,\\1= J2 E I,,, r.. 2 ((4)(:,)M-^/^^MM-V^(gi)(:.) 

( At ),-,— (Ai l-i-j V 



(Ai) 



33 



i=k+l i=k+l 

which gives 

(56) ||X2*MXi||^ <Tn7^ \\Q*M-'/^5MM-'^^Qi\\f. 

KelGapcojjjp 

Now from fl56l) . noting that Q and Q are both unitary, we obtain (|50|) . ■ 
We can now give a Frobenius norm version of Theorem I3.4[ 

Theorem 3.7. Let {H, M) be a Hermitian pair and let {H, M) be the perturbed pair. Let X = 
[Xi X2] and X = [X^ X2] , be non-singular matrices which simultaneously diagonalize the 

pairs {H,M) and {H,M), as in ^ and / flOj) . respectively. If the spectra are separated so 
that RelGapj,ojnp > then 

(57) II sine„(X.. XOIk < 5^ . + . . 

where \\ sin 0m(Xi, Xi)||ir — the sine of the angle between the subspaces in M scalar product 
— is defined by /[21\) . and RelGap and RelGap^^^ are defined by (d^j and 137]) . respectively. 

4. Numerical examples 

It is not easy to numerically compare the eigenvector estimates for the perurbations of ma- 
trix pencils. The reason is that there is no canonical norm for the analysis of the eigenvector 
problem. For instance, assume that we have a positive definite symmetric pencil {H,M), 
then any of the matrix dependent norms (and the associated scalar products) 

|||x|||q,^;3 = a/ a x*Hx + /3 a;*Mx, a > 0, /3 > 0, and a/? 7^ 

is a meaningful candidate as well as is the standard Euclidean norm || ■ ||. 

Applying any of the competing estimates in a situation for which they were not designed, 
is only possible after a nontrivial intervention which often severely affects the sharpness of 
the result. To this end we use the same set of problems for various approaches and compare 
them by comparing how well they are doing a job they were designed for. More to the point, 
the estimate of the type Left(K) < Right (k) — where k G M is a parameter — is considered 
asymptotically sharp if 

(58) lim = 1. 

K-s>cx) Right (k) 

Such property of an estimator is sometimes called the asymptotic exactness of an estimator, 
see [H] and relation flS5|) below. 

Remark 4.1. // we were to adopt the philosophy of [10], we would consider the family 
of eigenvalue problems {H, aH + /3M) — assuming a, /3 G M are such that aH + /3M is 
hermitian positive definite — and ask for such a and /3 which are in some sense optimal. 
We cannot give an answer to the question of the choice the optimal energy norm now, but 
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we might return to the question in future work. Instead, we note that given a, /3 G M such 
that aH + /3M and H are Hermitian positive definite reduces the problem to the one we can 
handle. The eigenvalues Aj of the matrix pair {H, M) and A"'^ of the pair {H, aH + (3M) 
are related by the transformation fii = Xi/{aXi + f3i). 

4.1. Perturbations of eigenspaces in the energy norm. We will now use the theory 
form the preceeding section to study the rotation of eigenvectors of a parameter dependent 
family of eigenvalue problems 



K > 1. 



Here iff, is positive definite, is a positive semi-definite matrix and we are interested in 
the estimate of the rotation of eigenvectors in the changing energy norm 



For some further motivation for studying these problems see the Appendix 
To this end we note that eigenvector problems 



(59) 
(60) 



-f^K^ = Af , V = —Hf^v, V = XH^ 
X 



have the same eigenvectors. Furthermore, it is known, [TJ [16] that as k tends to infinity the 
eigenvalues of if^ either tend to infinity or, they converge to the nonzero eigenvalues of 



Lb '■— -pKer(He)-^K 



Kcr(/fe) 



Subsequently, we decompose the space M*^ = Ker(ife) © (Ker(ife))"'" and, without reducing 
the level of generality — see [Ml Formula (12)] — think of if^ as the block operator matrix 



(61) 





R-b 


+ K 


'0 


" 


Rfy 


Wb_ 





He 



We also denote the block diagonal of H^^ with 

\Lb 



Wb + «:ife 



and compute 

\\D-"\D^-H^)D-y^\ 



1 





{Wb + KEBY^'^RbLl 




1/2 



Ll^'^RliWb + nEB) 




-1/2 



■mb + EB)-^/^RbLl''' 



L-b"'Rl{\Wb + Esr'/' 
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Let us introduce the perturbation estimate rjn^ 
note the following inequalities 



-1/2, 



With this we 



(62) 
(63) 



\x*H-^x - x*D-^x\ < 



x*D-^x. 



Obviously, with this analysis we can chose 
(64) Vh,^ 
and so we can apply Theorem 13.41 directly. 



Remark 4.2. This discussion indicates that it is easy, within this theory, to switch the roles 
of H and its inverse H^^. This is so because an estimate on the perturbation of the one 
implies the relative estimate for the perturbation of the other. A similar feature is shared by 
the relative gap from ( [57| ) since 



lA 



It is pleasing and useful — when switching the roles of H and M — that both ingredients of 
an estimate like ([5^ are robust with respect to inversion of the eigenvalues. 



For first simple experiments we consider the family of problems 

-1 



(65) 



2-10 
-1 2 -1 
-1 2 + K 



2 

-1 2 

-1 







-1 
2 -1 

-1 2 + K 



ft > 1. 



In the first experiment we will see how do the ingredients of the estimates — relative gap and 
the residual — feature in their performance. 

By Af < Af < Af we denote the eigenvalues of and by Af < A^'= < A^-^ < A?'= the 
eigenvalues of H^. We also use for eigenvectors the following notation 



1,2,3, 
1,2,3,4. 



The behavior of the family of problems dHS]) has been analyzed in [TB] with the help of 
the Gerschgorin theorem. Let us consider the eigenspace which belongs to the eigenvalues 
Af^" < A^'' and A'f" < A^". to this end we write the implicit partial diagonalization of if^ 
and Hk in the generic block matrix form 



Lh Rl 



where Lb, Wb, Rb and He are as in f l6T]) and is the diagonal matrix containing the targeted 
eigenvalues. The orthogonality property V*Vf^ + W*Wf^ = I together with the Gerschgorin 
theorem implies, see [Ml pg. 3209], the estimates 



(66) 



|^feK-KA«|| =0(i), ||V;*K-/| 

K 
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In the example that follows we show this explicitly on the model problem and indicate a 
possible dependence on n of the otherwise unaccessible matrix V^- 



Example 4.3. In this example we show that the estimates are asymptotically sharp — for 
the definition of this notion see (158|) below and reference [8] for a discussion of its significance 
in finite element computations — for the matrix H^. For this problem we have for eigenvalues 
and eigenvectors 

^. . 1 3 55 1 ^/1 
A? = l \ ^ 7 H r + O — 

1 _ 3 55 1 (I 
^2-^ 2k 8k2 + 128«:4 + 2/s:5+^1^«:6 

K K° \ K^ 



1 + ^ + ^ L + + ^ - 



675 



2ft ' ,8k2 _ 2k3 '^128k^ _'_ 2k5 . 1024k6 (k'^) 
128k5 



1 _L i _L _t ^ J 55 ]_ A_ n (J_] 



-1 + ^ - A 

^ 2k 8]? 



1 

2k^ 



1 



7 



128k4__' 2kS 



675 
1024k': 



1 _ i + ^ + ^ 55 ^^Q(J_) 

^ K ^ 2k2 ^ 8k3 128k5 2k6 ^ ^ \ k.'^ ) 



1 



and riH, 



-^T^. Note that the matrix 

6+3k 



has columns given by and and 



so we can see the dependence K; on the penalty parameter in this example explicitly. Using 
(EH) we obtain 



Rights : = 



1 



+ 



1 



K 



RelGap ^\ — f]H~^ RelGapp a/1 — 2 rjn, 
On the other hand, a simple computation and Theorem 13.41 yield, cf. fl66|) that 

Left^ := sinej/^(Ran[< v^], Ran[v^ v^]) = 0{^). 

Here we have used the symbol vf°, 

as K — > oo. They are also the eigenvectors of the limit matrix 



1, 2, 3 to denote the limit eigenvectors of , i = 1, 2, 3 



2-10 
-12 




This shows that the energy norm estimate is sharp when viewed as the function of k. On 
the other hand a simple computation reveals that any of the sin G theorems from [5l |9l [13] 
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Figure 2. Numerical experiment for Example 14.31 The experiment demon- 
strates the notion of the asymptotic sharpness. In this plot we have depicted 
the effectivity quotient against the penalty parameter, see Example 14. 31 for the 
definition 



yields a similar O(^) — or even wors (|for the 0(1) — upper estimate for the 



sine(Ran[< <], Ranlv^ <]) = O(-) 

K 



We now turn our attention to the study of the asymptotic sharpness — in the sense of fl58p 
— of our estimates on concrete examples This can be proved by direct computation for the 
case of our estimate applied to the matrix pairs {H~^,Hf^) and cf. Example 14.41 for 

further discussion. This shows that a notion of sharpness — a sin B theorem is considered to 
be sharp if there is a perturbation in the allowed class of perturbations such that the bound 
is attained — for the estimates of the rotation of eigenvectors is a delicate question. Let us 
note that we will call ^^^^ the effectivety quotient. 

Example 4.4. In this example we perform a Matlab experiment in which we evaluate the 
estimate of Corollary 13.51 for the matrix pairs 

The results are presented on Figure [31 The results further illustrate the delicacy of the issue 
of the sharpness of sin B theorems. Namely, the estimates are not asymptotically sharp for 
any of the considered matrix pairs, but the energy norm estimates — that is estimates for the 
pairs (/, Mf.) and (EI~^, EI«;) — are of the same order of the magnitude as the error — this can 
be seen from the fact that the effectivity quotients converge to a constant — where es in the 
case of the estimates for the other norms the effecivity quotients converge to zero. These 



^The residual estimate (|66| gets spoilt when we chose the orthonormal basis for Ran[i;5^ v^] as the 
columns of Vk are not orthonormal. 
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Figure 3. Numerical experiment for Example 14.41 In this plot we have de- 
picted the effectivity quotient against the penalty parameter, see Example 14.31 
for the definition. 

convergence claims can be verified by a direct symbolic computation. This example shows 
that both the choice of a measure of the spectral gap as well as the choice of the measure of 
the residual play a role in obtaining high performance estimators, since it was the influence 
of the measure of the relative gap which guaranteed the asymptotic sharpness in Example 
14. 3[ compare Figures |2] and [31 

4.2. A Matrix Market example. For a further illustration of an effect similar to mass 
lumping we will consider the generalized eigenvalue problem 

Hx = XMx, 

where the matrix H is taken from the Matrix Market basis, see [M] . We choose H from the 
set CYLSHELL: Finite element analysis of cylindrical shells matrices. From this test set we 
took the matrix slrmq4nil .mtx which is real symmetric positive definite, 5489 x 5489 matrix 
with 143300 entries. This matrix is obtained by finite element discretization of an octant of 
a cylindrical shell. The ends of the cylinder are free. 

For the matrix M we took diagonal matrix with — in Matlab notation — disg(l : n) and 
we consider random perturbations 6H and SM, which satisfy 

I (SH) ij \<T]H\Hij\, I (SM) ij I < r]M I Mij \ , 

where rju = rjM = 10~®. The above assumption means that zeros remain unperturbed and 
we have chosen the M matrix whose norm explodes as n — )■ oo. This is a reasonable choice 
for our method, since the technique of our proof can readily be adapted to yield the same 
result for some unbounded pair of operators in a Hilbert space. 

As a comparison we consider one of the well known the standard perturbation bound for 
matrix pairs is given by the theorem of Stewart and Sun from [THl Chapter VI] . To this end, 
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let {H, M) be a symmetric definite pair, such that ([9]) holds. That is, let X = [ Xi X2 ] 
be such that 



(67) 
where 



X* 



H[X, X2] 



"Ai 




"X*" 


^2. 




x;_ 



M[Xi X2] 



i-n—k 



Ai = diag (Ai, . . . , A^), ^2 = diag (A^+i, . . . , A„), 



and Xi G C"^'^, X2 G C"^"~'^. The following theorem contains a bound for the Frobe- 
nius norm of the diagonal matrix which contains the sines of the canonical angles between 
eigenspace Ran(Xi) and corresponding perturbed eigenspace !3^(Xi). 

Theorem 4.5 (Sun). Let the definite pair {H, M) be decomposed as in [Ul\ ) where Xi and X2 
have orthonormal columns. Let the analogous decomposition be given for the pair {H, M) = 
{H + SH,M + 6M). If 



lA- AI 



mm 



;A G ^(Ai),A G ^(Aa) 



then 



(68) ||sine[Ran(Xi),Ran(Xi)]||j. < 



^\\H^ + M^ ^\\5HXi\\l + \\6MX^\ 



where 
(69) 



-f{H,M)-f{H,M) d 
7(if, M) = min \x*{H + iM)x\ = min ^/{x*Hxy + {x*MxY > 0. 



EgC" 



11x11=1 



We estimate the perturbation of invariant subspace which corresponds with first four 
smallest eigenvalues of the matrix pair {H,M). The experiment is to be understood in the 
context of the testing of the asymptotic sharpness of the estimator as in the definition ( 158|) . 

Example 4.6 (The performance of our estimate). The exact perturbation gives: 

||sineA/(Xi,Xi)|| ^ 6.727- 10~^ 

while our bound ( H3l) gives 

||sineA/(Xi,Xi)|| < 8.6721 ■ 10"^ 

Example 4.7 (The performance of the Stewart-Sun bound). The bound ( !68l) here is not 
satisfactory due the fact that 7(i^, M) = 1, and -fiH + 6H, M + 6M) + On the other 
hand the gap S ~ 10"^ and y^\\H'^ + M'^\\ ~ 10^ Together with 

^\\dHXi\\l+ \\6M X^Wl = 3.872 ■ 10"^ , 

we have 

II sine[Ran(Xi),Ran(Xi)]||F < 6 ■ 10^ 
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Appendix A. A motivation to study the problems of the large coupling 

LIMIT 

Consider positive definite eigenvector problems of the following type: find ||'0|| = 1 and 
A e R such that 

(70) H^^P = //ftV' + «^eV' = AV', 

where if^ is positive definite matrix and H^. is a semidefinite perturbation which has a 
significant null space and k ^ 1. The presence of a large coupling constant k the singular 
perturbation causes the appearance of spurious, that is nonphysical, eigenvalues due to 
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the non-zero component of He- It is our aim to obtain bounds on the rotation of eigenspaces 
which is caused by this perturbation. 

When considering the famihes of matrices/operators hke = Hb + nHe, k ^ 1. the 
parameter k is called the coupling — or depending on the context the penalty — parameter. 
The family of perturbations nHg splits the spectrum of if^ into a bounded and an unbounded 
component as k — > oo. 

One typical example of a problem in this setting are the penalty methods for Maxwell or 
Stokes' eigenvalue problems. For more information and further references see [16]. There, 
the authors analyze the dependence of the spectrum of the discretization matrix of the 
Maxwell's eigenvalue problem on the large coupling parameter and show — by a very elegant 
Gerschgorin type argument — that as k — oo the eigenvalues of interest converge with the 
rate proportional to k,~^. 

Let us note that the models where one considers the limits of the large penalty are repre- 
sentative for a larger class of parameter dependent singularly perturbed eigenvalue problems. 
These problems typically appear in the study of optical nano-devices, hard core scattering 
theory and in the analysis of lower dimensional approximations to the 3D elasticity (like 
Arches and Plates), see [3l HJ [6l [7]. Another example is the so called "lumped mass ap- 
proximation" in which an auxiliary diagonal mass matrix M is constructed which generates 
an equivalent scalar product. Such matrices are typically constructed by using quadrature 
formulae and pseudo projections, see [1]. The analysis from [31 [7] shows that the eigen- 
value estimates form [K] are sharp when viewed in terms of the dependence on the coupling 
constant, cf. Example 14.31 |^ 
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'Explicit constants and their physical interpretations are explicitly given in |3l [7] . 
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